59 research outputs found

    Alignment of the UMLS semantic network with BioTop: Methodology and assessment

    Motivation: For many years, the Unified Medical Language System (UMLS) semantic network (SN) has been used as an upper-level semantic framework for the categorization of terms from terminological resources in biomedicine. BioTop has recently been developed as an upper-level ontology for the biomedical domain. In contrast to the SN, it is founded upon strict ontological principles, using OWL DL as a formal representation language, which has become standard in the Semantic Web. In order to make logic-based reasoning available for the resources annotated or categorized with the SN, a mapping ontology was developed aligning the SN with BioTop. Methods: The theoretical foundations and the practical realization of the alignment are described, with a focus on the design decisions taken, the problems encountered and the adaptations of BioTop that became necessary. For evaluation purposes, UMLS concept pairs obtained from MEDLINE abstracts by a named entity recognition system were tested for possible semantic relationships. Furthermore, all semantic-type combinations that occur in the UMLS Metathesaurus were checked for satisfiability. Results: The effort-intensive alignment process required major design changes and enhancements of BioTop and brought up s

    Identifying and classifying biomedical perturbations in text

    Molecular perturbations provide a powerful toolset for biomedical researchers to scrutinize the contributions of individual molecules in biological systems. Perturbations qualify the context of experimental results and, despite their diversity, share properties in different dimensions in ways that can be formalized. We propose a formal framework to describe and classify perturbations that allows accumulation of knowledge in order to inform the process of biomedical scientific experimentation and target analysis. We apply this framework to develop a novel algorithm for automatic detection and characterization of perturbations in text and show its relevance in the study of gene–phenotype associations and protein–protein interactions in diabetes and cancer. Analyzing perturbations introduces a novel view of the multivariate landscape of biological systems

    The CALBC Silver Standard Corpus for Biomedical Named Entities - A Study in Harmonizing the Contributions from Four Independent Named Entity Taggers

    The production of gold standard corpora is time-consuming and costly. We propose an alternative: the 'silver standard corpus' (SSC), a corpus that has been generated by the harmonisation of the annotations that have been delivered from a selection of annotation systems. The systems have to share the type system for the annotations and the harmonisation solution has to use a suitable similarity measure for the pair-wise comparison of the annotations. The annotation systems have been evaluated against the harmonised set (630,324 sentences, 15,956,841 tokens). We can demonstrate that the annotation of proteins and genes shows higher diversity across all used annotation solutions leading to a lower agreement against the harmonised set in comparison to the annotations of diseases and species. An analysis of the most frequent annotations from all systems shows that a high agreement amongst systems leads to the selection of terms that are suitable to be kept in the harmonised set. This is the first large-scale approach to generate an annotated corpus from automated annotation systems. Further research is required to understand how the annotations from different systems have to be combined to produce the best annotation result for a harmonised corpus
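
The harmonisation idea in this abstract can be illustrated with a minimal sketch. It is not the CALBC method: the abstract does not specify the similarity measure or the selection rule, so the sketch assumes exact-match comparison of annotations and a simple vote threshold. The `Annotation` tuple layout, the function names `harmonise` and `f_score`, and the `min_votes` parameter are all hypothetical.

```python
from collections import Counter
from typing import List, Set, Tuple

# Hypothetical annotation representation: (start offset, end offset, semantic type).
Annotation = Tuple[int, int, str]


def harmonise(system_outputs: List[Set[Annotation]], min_votes: int = 2) -> Set[Annotation]:
    """Keep every annotation proposed by at least `min_votes` systems.

    Assumption: two annotations agree only if span and type match exactly.
    """
    votes = Counter(a for output in system_outputs for a in output)
    return {a for a, n in votes.items() if n >= min_votes}


def f_score(system: Set[Annotation], reference: Set[Annotation]) -> float:
    """F1 of one system's annotations against the harmonised set."""
    true_positives = len(system & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(system)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Toy outputs from four hypothetical taggers over the same sentence.
s1 = {(0, 5, "protein"), (10, 18, "disease")}
s2 = {(0, 5, "protein"), (10, 18, "disease")}
s3 = {(0, 7, "protein"), (10, 18, "disease")}
s4 = {(20, 25, "species")}

silver = harmonise([s1, s2, s3, s4], min_votes=2)
scores = [f_score(s, silver) for s in (s1, s2, s3, s4)]
```

In this toy case the third tagger disagrees on the protein boundary, so it matches only half of the harmonised set. A relaxed similarity measure (for example, overlapping spans) would score it differently, which is why the choice of measure matters.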

    A core ontology of macroscopic stuff

    Domain ontologies contain representations of types of stuff (matter, mass, or substance), such as milk, alcohol, and mud, which are represented in a myriad of ways that are neither compatible with each other nor follow a structured approach within the domain ontology. Foundational ontologies and Ontology distinguish only between pure stuff and mixtures, if they include stuff at all. We aim to fill this gap between foundational and domain ontologies by applying the notion of a 'bridging' core ontology, being an ontology of categories of stuff that is formalised in OWL. This core ontology both refines the DOLCE and BFO foundational ontologies and resolves the main type of interoperability issues with stuffs in domain ontologies, thereby also contributing to better ontology quality. Modelling guidelines are provided to facilitate the Stuff Ontology's use

    Patterns of nucleotide diversity at the regions encompassing the Drosophila insulin-like peptide (dilp) genes: demography vs positive selection in Drosophila melanogaster.

    In Drosophila, the insulin-signaling pathway controls some life history traits, such as fertility and lifespan, and it is considered to be the main metabolic pathway involved in establishing adult body size. Several observations concerning variation in body size in the Drosophila genus are suggestive of its adaptive character. Genes encoding proteins in this pathway are, therefore, good candidates to have experienced adaptive changes and to reveal the footprint of positive selection. The Drosophila insulin-like peptides (DILPs) are the ligands that trigger the insulin-signaling cascade. In Drosophila melanogaster, there are several peptides that are structurally similar to the single mammalian insulin peptide. The footprint of recent adaptive changes on nucleotide variation can be unveiled through the analysis of polymorphism and divergence. With this aim, we have surveyed nucleotide sequence variation at the dilp1-7 genes in a natural population of D. melanogaster. The comparison of polymorphism in D. melanogaster and divergence from D. simulans at different functional classes of the dilp genes provided no evidence of adaptive protein evolution after the split of the D. melanogaster and D. simulans lineages. However, our survey of polymorphism at the dilp gene regions of D. melanogaster has provided some evidence for the action of positive selection at or near these genes. The regions encompassing the dilp1-4 genes and the dilp6 gene stand out as likely affected by recent adaptive events

    Reuse of terminological resources for efficient ontological engineering in Life Sciences

    This paper explores how to use terminological resources for ontology engineering. Nowadays there are several biomedical ontologies describing overlapping domains, but there is no clear correspondence between the concepts that are supposed to be equivalent or just similar. These resources are valuable, but their integration and further development are expensive. Terminologies may support ontological development in several stages of the lifecycle of the ontology, e.g. ontology integration. In this paper we investigate the use of terminological resources during the ontology lifecycle. We claim that the proper creation and use of a shared thesaurus is a cornerstone for the successful application of Semantic Web technology within the life sciences. Moreover, we have applied our approach to a real scenario, the Health-e-Child (HeC) project, and we have evaluated the impact of filtering and re-organizing several resources. As a result, we have created a reference thesaurus for this project, named HeCTh

    Relating some stuff to other stuff

    Traceability in food and medicine supply chains has to handle stuffs—entities such as milk and starch indicated with mass nouns—and their portions and parts that get separated and put together to make the final product. Implementations have underspecified ‘links’, if at all, and theoretical accounts from philosophy and in domain ontologies are incomplete as regards the relations involved. To solve this issue, we define seven relations for portions and stuff-parts, which are temporal where needed. The resulting theory distinguishes between the extensional and intensional level, and between amount of stuff and quantity. With application trade-offs, this has been implemented as an extension to the Stuff Ontology core ontology that now also imports a special purpose module of the Ontology of units of Measure for quantities. Although atemporal, some automated reasoning for traceability is still possible thanks to using property chains to approximate the relevant temporal aspects

    Drosophila Genes That Affect Meiosis Duration Are among the Meiosis Related Genes That Are More Often Found Duplicated

    Using a phylogenetic approach, the examination of 33 meiosis/meiosis-related genes in 12 Drosophila species revealed nine independent gene duplications, involving the genes cav, mre11, meiS332, polo and mtrm. Evidence is provided that at least eight out of the nine gene duplicates are functional. Therefore, the rate at which Drosophila meiosis/meiosis-related genes are duplicated and retained is estimated to be 0.0012 per gene per million years, a value that is similar to the average for all Drosophila genes. It should be noted that by using a phylogenetic approach the confounding effect of concerted evolution, which is known to lead to overestimation of the duplication and retention rate, is avoided. This is an important issue, since even in our moderate-size sample, evidence for long-term concerted evolution (lasting for more than 30 million years) was found for the meiS332 gene pair in species of the Drosophila subgenus. Most striking, in contrast to theoretical expectations, is the finding that genes that encode proteins that must follow a close stoichiometric balance, such as polo, mtrm and meiS332, have been found duplicated. The duplicated genes may be examples of gene neofunctionalization. It is speculated that meiosis duration may be a trait that is under selection in Drosophila and that it has different optimal values in different species